Keyword Search Result

[Keyword] Markov model(95hit)

21-40hit(95hit)

  • An Extension of Separable Lattice 2-D HMMs for Rotational Data Variations

    Akira TAMAMORI  Yoshihiko NANKAKU  Keiichi TOKUDA  

     
    PAPER-Pattern Recognition

    Vol: E95-D  No: 8  Page(s): 2074-2083

    This paper proposes a new generative model which can deal with rotational data variations by extending Separable Lattice 2-D HMMs (SL2D-HMMs). In image recognition, geometrical variations such as size, location and rotation degrade the performance. Therefore, appropriate normalization processes for such variations are required. SL2D-HMMs can perform an elastic matching in both horizontal and vertical directions; this makes it possible to model invariance to size and location. To deal with rotational variations, we introduce additional HMM states which represent the shifts of the state alignments among the observation lines in a particular direction. Face recognition experiments show that the proposed method significantly improves the performance on data with rotational variations.

  • Hidden Conditional Neural Fields for Continuous Phoneme Speech Recognition (Open Access)

    Yasuhisa FUJII  Kazumasa YAMAMOTO  Seiichi NAKAGAWA  

     
    PAPER-Speech and Hearing

    Vol: E95-D  No: 8  Page(s): 2094-2104

    In this paper, we propose Hidden Conditional Neural Fields (HCNF) for continuous phoneme speech recognition. HCNF combines Hidden Conditional Random Fields (HCRF) with a Multi-Layer Perceptron (MLP) and inherits the merits of both: the discriminative property for sequences from HCRF and the ability to extract non-linear features from an MLP. HCNF can incorporate many types of features from which non-linear features can be extracted, and is trained by sequential criteria. We first present the formulation of HCNF and then examine three methods to further improve automatic speech recognition using HCNF: an objective function that explicitly considers training errors, a hierarchical tandem-style feature, and a deep non-linear feature extractor for the observation function. Experimental results for continuous English phoneme recognition on the TIMIT core test set and Japanese phoneme recognition on the IPA 100 test set show that HCNF can be trained realistically without any initial model and outperforms HCRF and the triphone hidden Markov model trained with the minimum phone error (MPE) criterion.

  • Online Anomaly Prediction for Real-Time Stream Processing

    Yuanqiang HUANG  Zhongzhi LUAN  Depei QIAN  Zhigao DU  Ting CHEN  Yuebin BAI  

     
    PAPER-Network Management/Operation

    Vol: E95-B  No: 6  Page(s): 2034-2042

    In real-time stream processing, it is important to develop high-availability mechanisms that keep stream-based applications from being disrupted by faults caused by potential anomalies. In this paper, we present a novel online technique for predicting anomalies that may occur in the near future. Concretely, we first present a value prediction method that combines the Hidden Markov Model and the Mixture of Expert Model to predict the values of feature metrics in the near future. We then employ a Support Vector Machine for anomaly identification, that is, to identify the kind of anomaly about which an alarm is to be raised. The purpose of our approach is to achieve a tradeoff between fault penalty and resource cost. Experimental results show that our approach achieves high accuracy for common anomaly prediction with low runtime overhead.

  • A VLSI Architecture with Multiple Fast Store-Based Block Parallel Processing for Output Probability and Likelihood Score Computations in HMM-Based Isolated Word Recognition

    Kazuhiro NAKAMURA  Ryo SHIMAZAKI  Masatoshi YAMAMOTO  Kazuyoshi TAKAGI  Naofumi TAKAGI  

     
    PAPER

    Vol: E95-C  No: 4  Page(s): 456-467

    This paper presents a memory-efficient VLSI architecture for output probability computations (OPCs) of continuous hidden Markov models (HMMs) and likelihood score computations (LSCs). These computations are the most time-consuming part of HMM-based isolated word recognition systems. We demonstrate multiple fast store-based block parallel processing (MultipleFastStoreBPP) for OPCs and LSCs and present a VLSI architecture that supports it. Compared with conventional fast store-based block parallel processing (FastStoreBPP) and stream-based block parallel processing (StreamBPP) architectures, the proposed architecture requires fewer registers and less processing time. The processing elements (PEs) used in the FastStoreBPP and StreamBPP architectures are identical to those used in the MultipleFastStoreBPP architecture. From a VLSI architectural viewpoint, a comparison shows that the proposed architecture is an improvement over the others, through efficient use of PEs and registers for storing input feature vectors.

  • Robust Gait-Based Person Identification against Walking Speed Variations

    Muhammad Rasyid AQMAR  Koichi SHINODA  Sadaoki FURUI  

     
    PAPER-Image Recognition, Computer Vision

    Vol: E95-D  No: 2  Page(s): 668-676

    Variations in walking speed have a strong impact on gait-based person identification. We propose a method that is robust against walking-speed variations. It is based on a combination of cubic higher-order local auto-correlation (CHLAC), gait silhouette-based principal component analysis (GSP), and a statistical framework using hidden Markov models (HMMs). The CHLAC features capture the within-phase spatio-temporal characteristics of each individual, the GSP features retain more shape/phase information for better gait sequence alignment, and the HMMs classify the ID of each gait even when walking speed changes nonlinearly. We compared the performance of our method with other conventional methods using five different databases, SOTON, USF-NIST, CMU-MoBo, TokyoTech A and TokyoTech B. The proposed method was equal to or better than the others when the speed did not change greatly, and it was significantly better when the speed varied across and within a gait sequence.

  • A Markov-Based Satellite-to-Ground Optical Channel Model and Its Effective Coding Scheme

    Yoshitoshi YAMASHITA  Eiji OKAMOTO  Yasunori IWANAMI  Yozo SHOJI  Morio TOYOSHIMA  Yoshihisa TAKAYAMA  

     
    PAPER-Satellite Communications

    Vol: E95-B  No: 1  Page(s): 254-262

    We propose a novel channel model of satellite-to-ground optical transmission to achieve a global-scale high-capacity communication network. In addition, we construct an effective channel coding scheme based on low-density generator matrix (LDGM) code suitable for that channel. Because the first successful optical satellite communication demonstrations are quite recent, no practical channel model has been introduced. We analyze the results of optical transmission experiments between a ground station and the Optical Inter-orbit Communications Engineering Test Satellite (OICETS) performed by NICT and JAXA in 2008 and propose a new Markov-based practical channel model. Furthermore, using this model, we design an effective long erasure code (LEC) based on LDGM to achieve high-quality wireless optical transmission.

  • HMM-Based Underwater Target Classification with Synthesized Active Sonar Signals

    Taehwan KIM  Keunsung BAE  

     
    LETTER-Digital Signal Processing

    Vol: E94-A  No: 10  Page(s): 2039-2042

    This paper deals with underwater target classification using synthesized active sonar signals. First, we synthesized active sonar returns from a 3D highlight model of underwater targets using the ray tracing algorithm. Then, we applied a multiaspect target classification scheme based on a hidden Markov model to classify them. A matching pursuit algorithm was used to extract features from the synthesized sonar signals. Experimental results for varying numbers of observations and signal-to-noise ratios are presented and discussed.

  • VLSI Architecture of GMM Processing and Viterbi Decoder for 60,000-Word Real-Time Continuous Speech Recognition

    Hiroki NOGUCHI  Kazuo MIURA  Tsuyoshi FUJINAGA  Takanobu SUGAHARA  Hiroshi KAWAGUCHI  Masahiko YOSHIMOTO  

     
    PAPER

    Vol: E94-C  No: 4  Page(s): 458-467

    We propose a low-memory-bandwidth, high-efficiency VLSI architecture for 60-k word real-time continuous speech recognition. Our architecture includes a cache architecture using the locality of speech recognition, beam pruning using a dynamic threshold, two-stage language model searching, a parallel Gaussian Mixture Model (GMM) architecture based on the mixture level and frame level, a parallel Viterbi architecture, and pipeline operation between Viterbi transition and GMM processing. Results show that our architecture achieves 88.24% required frequency reduction (66.74 MHz) and 84.04% memory bandwidth reduction (549.91 MB/s) for real-time 60-k word continuous speech recognition.

  • Binary Oriented Vulnerability Analyzer Based on Hidden Markov Model

    Hao BAI  Chang-zhen HU  Gang ZHANG  Xiao-chuan JING  Ning LI  

     
    LETTER-Dependable Computing

    Vol: E93-D  No: 12  Page(s): 3410-3413

    This letter proposes a novel binary vulnerability analyzer for executable programs that is based on the Hidden Markov Model (HMM). A vulnerability instruction library (VIL) is first constructed by collecting binary frames located by double precision analysis. Executable programs are then converted into structured code sequences with the VIL. The code sequences are essentially context-sensitive and can therefore be modeled by an HMM. Finally, the HMM-based vulnerability analyzer is built to recognize potential vulnerabilities of executable programs. Experimental results show that the proposed approach achieves lower false positive and false negative rates than the latest static analyzers.

  • Spectrum Handoff for Cognitive Radio Systems Based on Prediction Considering Cross-Layer Optimization

    Xiaoyu QIAO  Zhenhui TAN  Bo AI  Jiaying SONG  

     
    PAPER

    Vol: E93-B  No: 12  Page(s): 3274-3283

    The spectrum handoff problem for cognitive radio systems is considered in this paper. The secondary users (SUs) can only opportunistically access the spectrum holes, i.e., the frequency channels unoccupied by the primary users (PUs). As soon as a PU appears, SUs have to vacate the channel to avoid interfering with PUs and switch to another available channel. In this paper, a prediction-based spectrum handoff scheme is proposed to reduce the negative effects (both the interference to PUs and the service blocking of SUs) during the switching time. In the proposed scheme, a hidden Markov model is used to predict the occupancy of a frequency channel. By estimating the state of the model at the next time instant, we can predict whether the frequency channel will be occupied by PUs. As a cross-layer design, the spectrum sensing performance parameters, namely the false alarm probability and the missed detection probability, are taken into account to enhance the accuracy of the channel occupancy prediction. The proposed scheme in turn acts on the spectrum sensing algorithm parameters, since the spectrum handoff performance is significantly affected by them. Adopting the proposed spectrum handoff scheme can noticeably reduce the interference to the PUs, at the cost of a potential increase in the switching delay of SUs. It also helps SUs save broadband scan time and select an appropriate target channel so as to avoid service blocking. Numerical results demonstrate these performance improvements.
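
    The prediction step described in this abstract can be illustrated with a minimal sketch. It is not the authors' scheme: it assumes a two-state channel (idle/busy), treats the false alarm and missed detection probabilities as the HMM emission probabilities, and uses hypothetical names and parameter values throughout.

```python
def predict_busy_probability(observations, trans, p_fa, p_md, prior_busy=0.5):
    """Forward-filter a two-state HMM and predict the next-slot busy probability.

    Assumptions (not taken from the paper):
      state 0 = channel idle, state 1 = channel occupied by a PU
      observation 1 = sensing reports "busy", 0 = sensing reports "idle"
      trans[i][j]  = P(next state = j | current state = i)
      p_fa         = P(report busy | channel idle)   (false alarm)
      p_md         = P(report idle | channel busy)   (missed detection)
    """
    # emit[state][observation]
    emit = [[1.0 - p_fa, p_fa],
            [p_md, 1.0 - p_md]]
    belief = [1.0 - prior_busy, prior_busy]
    for obs in observations:
        # Correction: weight the belief by the likelihood of the sensing result.
        weighted = [belief[s] * emit[s][obs] for s in (0, 1)]
        total = sum(weighted)
        belief = [w / total for w in weighted]
        # Prediction: propagate the belief one step through the transition matrix.
        belief = [sum(belief[i] * trans[i][j] for i in (0, 1)) for j in (0, 1)]
    return belief[1]


if __name__ == "__main__":
    transition = [[0.9, 0.1],   # hypothetical idle -> idle / busy
                  [0.2, 0.8]]   # hypothetical busy -> idle / busy
    print(predict_busy_probability([1, 1, 0, 1], transition, p_fa=0.1, p_md=0.1))
```

    An SU could, for example, compare the returned probability against a threshold before choosing a target channel; the threshold itself would be a design choice outside this sketch.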

  • The Jiggle-Viterbi Algorithm for the RFID Reader Using Structured Data-Encoded Waveforms

    Yung-Yi WANG  Jiunn-Tsair CHEN  

     
    PAPER

    Vol: E93-A  No: 11  Page(s): 2108-2114

    Signals received at the interrogator of an RFID system always suffer from various kinds of channel deformation factors, such as the path loss of the wireless channel, insufficient channel bandwidth resulting from multipath propagation, and the carrier frequency offset between tags and interrogators. In this paper, we propose a novel Viterbi-based algorithm for joint detection of the data sequence and compensation of the distorted signal waveform. Assuming that the transmission clock is exactly synchronized at the reader, the proposed algorithm takes advantage of the structured data-encoded waveform to represent the modulation scheme of the RFID system as a trellis diagram, so that the Viterbi algorithm can be applied to perform data sequence estimation. Furthermore, to compensate for the distorted symbol waveform, the proposed Jiggle-Viterbi algorithm generates two substates, each corresponding to a variant structure waveform with adjustable temporal support, so that the symbol waveform deformation can be compensated, yielding significantly better performance in terms of bit error rate. Computer simulations show that even in the presence of a moderate carrier frequency offset, the proposed approach achieves acceptable accuracy in data sequence detection.

  • Acoustic Model Adaptation for Speech Recognition

    Koichi SHINODA  

     
    INVITED PAPER

    Vol: E93-D  No: 9  Page(s): 2348-2362

    Statistical speech recognition using continuous-density hidden Markov models (CDHMMs) has yielded many practical applications. However, in general, mismatches between the training data and input data significantly degrade recognition accuracy. Various acoustic model adaptation techniques using a few input utterances have been employed to overcome this problem. In this article, we survey these adaptation techniques, including maximum a posteriori (MAP) estimation, maximum likelihood linear regression (MLLR), and eigenvoice. We also present a schematic view called the adaptation pyramid to illustrate how these methods relate to each other.

  • Efficient Parallel Learning of Hidden Markov Chain Models on SMPs

    Lei LI  Bin FU  Christos FALOUTSOS  

     
    INVITED PAPER

    Vol: E93-D  No: 6  Page(s): 1330-1342

    Quad-core CPUs have become a common desktop configuration in today's offices. The increasing number of processors on a single chip opens new opportunities for parallel computing. Our goal is to make use of multi-core as well as multi-processor architectures to speed up large-scale data mining algorithms. In this paper, we present a general parallel learning framework, Cut-And-Stitch, for training hidden Markov chain models. In particular, we propose two model-specific variants, CAS-LDS for learning linear dynamical systems (LDS) and CAS-HMM for learning hidden Markov models (HMM). Our main contribution is a novel method to handle the data dependencies due to the chain structure of hidden variables, so as to parallelize the EM-based parameter learning algorithm. We implement CAS-LDS and CAS-HMM using OpenMP on two supercomputers and a quad-core commercial desktop. The experimental results show that parallel algorithms using Cut-And-Stitch achieve comparable accuracy and almost linear speedups over the traditional serial version.

  • A VLSI Architecture for Output Probability Computations of HMM-Based Recognition Systems with Store-Based Block Parallel Processing

    Kazuhiro NAKAMURA  Masatoshi YAMAMOTO  Kazuyoshi TAKAGI  Naofumi TAKAGI  

     
    PAPER-VLSI Systems

    Vol: E93-D  No: 2  Page(s): 300-305

    In this paper, a fast and memory-efficient VLSI architecture for output probability computations of continuous Hidden Markov Models (HMMs) is presented. These computations are the most time-consuming part of HMM-based recognition systems. High-speed VLSI architectures with small registers and low-power dissipation are required for the development of mobile embedded systems with capable human interfaces. We demonstrate store-based block parallel processing (StoreBPP) for output probability computations and present a VLSI architecture that supports it. When the number of HMM states is adequate for accurate recognition, compared with conventional stream-based block parallel processing (StreamBPP) architectures, the proposed architecture requires fewer registers and processing elements and less processing time. The processing elements used in the StreamBPP architecture are identical to those used in the StoreBPP architecture. From a VLSI architectural viewpoint, a comparison shows the efficiency of the proposed architecture through efficient use of registers for storing input feature vectors and intermediate results during computation.

  • A Technique for Estimating Intensity of Emotional Expressions and Speaking Styles in Speech Based on Multiple-Regression HSMM

    Takashi NOSE  Takao KOBAYASHI  

     
    PAPER-Speech and Hearing

    Vol: E93-D  No: 1  Page(s): 116-124

    In this paper, we propose a technique for estimating the degree or intensity of emotional expressions and speaking styles appearing in speech. The key idea is based on a style control technique for speech synthesis using a multiple regression hidden semi-Markov model (MRHSMM), and the proposed technique can be viewed as the inverse of the style control. In the proposed technique, the acoustic features of spectrum, power, fundamental frequency, and duration are simultaneously modeled using the MRHSMM. We derive an algorithm for estimating explanatory variables of the MRHSMM, each of which represents the degree or intensity of emotional expressions and speaking styles appearing in acoustic features of speech, based on a maximum likelihood criterion. We show experimental results to demonstrate the ability of the proposed technique using two types of speech data, simulated emotional speech and spontaneous speech with different speaking styles. The estimated values are found to correlate with human perception.

  • An Improved Encoder for Joint Source-Channel Decoder Using Conditional Entropy Constraint

    Moonseo PARK  Seong-Lyun KIM  

     
    LETTER-Fundamental Theories for Communications

    Vol: E92-B  No: 6  Page(s): 2222-2225

    When a joint source-channel (JSC) decoder is used for source coding over noisy channels, the JSC decoder may introduce errors even though the received data is not corrupted by channel noise, if the JSC decoder assumes the channel was noisy. A novel encoder algorithm has recently been proposed to improve the performance of the communications system in this situation. In this letter, we propose another algorithm, based on a conditional entropy-constrained vector quantizer, to further improve the encoder. The proposed algorithm significantly improves the performance of the communications system when the hypothesized channel bit error rate is high.

  • Distinctive Phonetic Feature (DPF) Extraction Based on MLNs and Inhibition/Enhancement Network

    Mohammad Nurul HUDA  Hiroaki KAWASHIMA  Tsuneo NITTA  

     
    PAPER-Speech and Hearing

    Vol: E92-D  No: 4  Page(s): 671-680

    This paper describes a distinctive phonetic feature (DPF) extraction method for use in a phoneme recognition system; our method has a low computation cost. This method comprises three stages. The first stage uses two multilayer neural networks (MLNs): MLN_LF-DPF, which maps continuous acoustic features, or local features (LFs), onto discrete DPF features, and MLN_Dyn, which constrains the DPF context at the phoneme boundaries. The second stage incorporates inhibition/enhancement (In/En) functionalities to discriminate whether the DPF dynamic patterns of trajectories are convex or concave, where convex patterns are enhanced and concave patterns are inhibited. The third stage decorrelates the DPF vectors using the Gram-Schmidt orthogonalization procedure before feeding them into a hidden Markov model (HMM)-based classifier. In an experiment on Japanese Newspaper Article Sentences (JNAS) utterances, the proposed feature extractor, which incorporates two MLNs and an In/En network, was found to provide a higher phoneme correct rate with fewer mixture components in the HMMs.
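
    The Gram-Schmidt orthogonalization named in the third stage can be sketched generically as follows. The abstract does not say how the procedure is applied to the DPF vectors, so this is only the textbook (modified) Gram-Schmidt procedure on plain lists of floats, with a hypothetical function name and tolerance.

```python
def gram_schmidt(vectors, eps=1e-12):
    """Return an orthonormal basis spanning the given vectors.

    Uses the modified Gram-Schmidt update (projections are removed from the
    running residual one basis vector at a time). Vectors that are linearly
    dependent on earlier ones are dropped; eps is an assumed tolerance.
    """
    basis = []
    for v in vectors:
        w = [float(x) for x in v]
        for b in basis:
            # Remove the component of w that lies along the basis vector b.
            coeff = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        norm = sum(wi * wi for wi in w) ** 0.5
        if norm > eps:
            basis.append([wi / norm for wi in w])
    return basis


if __name__ == "__main__":
    # Hypothetical 3-dimensional feature vectors, for illustration only.
    for row in gram_schmidt([[1, 1, 0], [1, 0, 1], [0, 1, 1]]):
        print(row)
```

    The output vectors are mutually orthogonal with unit length, which is the sense in which correlated input directions are separated before being passed on to a classifier.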

  • Mobile Positioning in Mixed LOS/NLOS Conditions Using Modified EKF Banks and Data Fusion Method

    Liang CHEN  Lenan WU  

     
    PAPER-Sensing

    Vol: E92-B  No: 4  Page(s): 1318-1325

    A novel method is proposed to track the position of a mobile station (MS) under mixed line-of-sight/non-line-of-sight (LOS/NLOS) conditions in a cellular network. A first-order Markov model is employed to describe the dynamic transition of LOS/NLOS conditions, which is hidden in the measurement data. The method first uses modified EKF banks to jointly estimate both the mobile state (position and velocity) and the hidden sight state based on the data collected by a single base station (BS). A Bayesian data fusion algorithm is then applied to achieve high estimation accuracy. Simulation results show that the location errors of the proposed method are all significantly smaller than the FCC requirement under different LOS/NLOS conditions. In addition, the method is robust in the parameter mismodeling test. Complexity experiments suggest that it supports real-time application. Moreover, the algorithm is flexible enough to support different types of measurement methods and asynchronous or synchronous observation data, which makes it especially suitable for future cooperative location systems.

  • Iterative Channel Estimation in MIMO Antenna Selection Systems for Correlated Gauss-Markov Channel

    Yousuke NARUSE  Jun-ichi TAKADA  

     
    PAPER-Wireless Communication Technologies

    Vol: E92-B  No: 3  Page(s): 922-932

    We address the issue of MIMO channel estimation with the aid of a priori temporal correlation statistics of the channel as well as the spatial correlation. The temporal correlations are incorporated into the estimation scheme by assuming the Gauss-Markov channel model. Under the MMSE criterion, the Kalman filter performs an iterative optimal estimation. To take advantage of the enhanced estimation capability, we focus on the problem of channel estimation from a partial channel measurement in the MIMO antenna selection system. We discuss the optimal training sequence design, and also the optimal antenna subset selection for channel measurement based on the statistics. In a highly correlated channel, the estimation works even when the measurements from some antenna elements are omitted at each fading block.
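
    As a rough illustration of Kalman filtering under a Gauss-Markov model, the sketch below tracks a single real-valued channel coefficient. It is a scalar simplification and not the paper's MIMO estimator; the model h[t] = a*h[t-1] + w, y[t] = h[t] + v and all names and parameter values are assumptions made for the example.

```python
def kalman_gauss_markov(measurements, a, q, r, h0=0.0, p0=1.0):
    """Scalar Kalman filter for a first-order Gauss-Markov process.

    Assumed model (illustrative only):
      h[t] = a * h[t-1] + w[t],  w ~ N(0, q)   (temporal correlation)
      y[t] = h[t] + v[t],        v ~ N(0, r)   (noisy channel measurement)
    Returns a list of (estimate, error variance) pairs, one per measurement.
    """
    h, p = h0, p0
    estimates = []
    for y in measurements:
        # Prediction from the Gauss-Markov dynamics.
        h_pred = a * h
        p_pred = a * a * p + q
        # MMSE update with the new measurement.
        gain = p_pred / (p_pred + r)
        h = h_pred + gain * (y - h_pred)
        p = (1.0 - gain) * p_pred
        estimates.append((h, p))
    return estimates


if __name__ == "__main__":
    # Hypothetical measurements of a slowly varying coefficient.
    for est, var in kalman_gauss_markov([1.1, 0.9, 1.05, 0.98], a=0.99, q=0.01, r=0.1):
        print(round(est, 4), round(var, 4))
```

    In this simplified setting, skipping a measurement would amount to running only the prediction step for that block, which loosely mirrors the idea of omitting some antenna measurements when the channel is highly correlated.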

  • New Rotation-Invariant Texture Analysis Technique Using Radon Transform and Hidden Markov Models

    Abdul JALIL  Anwar MANZAR  Tanweer A. CHEEMA  Ijaz M. QURESHI  

     
    LETTER-Computer Graphics

    Vol: E91-D  No: 12  Page(s): 2906-2909

    A rotation-invariant texture analysis technique is proposed with a novel combination of the Radon Transform (RT) and Hidden Markov Models (HMMs). Features of each texture are extracted with the RT, which by its inherent properties captures all the directional properties of a texture. HMMs are used for classification. One HMM is trained for each texture on its feature vector, which preserves the rotational invariance of the feature vector in a more compact and useful form. Once all the HMMs have been trained, testing is done by presenting any of these textures at an arbitrary orientation. The best percentage of correct classification (PCC), obtained on sixty textures from the Brodatz album, is above 98%.
